Protein function prediction method based on PPI network and machine learning
TANG Jiaqi, WU Jingli
Journal of Computer Applications    2018, 38 (3): 722-727.   DOI: 10.11772/j.issn.1001-9081.2017082042
To address the low precision and noise sensitivity of current protein function prediction methods based on Protein-Protein Interaction (PPI) networks, a new machine learning method named HPMM (HC, PCA and MLP based Method) was proposed, which combined Hierarchical Clustering (HC), Principal Component Analysis (PCA) and Multi-Layer Perceptron (MLP). HPMM took both macro and micro perspectives into account. It incorporated information on protein families, domains and important sites into the vertex attributes of the PPI network to alleviate the effect of network data noise. Firstly, the features of functional modules and attribute principal components were extracted by using HC and PCA. Secondly, a mapping between multiple features and multiple functions, used to predict protein functions, was constructed by training the MLP model. Three Homo sapiens PPI networks, annotated with Molecular Functions (MF), Biological Processes (BP) and Cellular Components (CC) respectively, were adopted in the experiments. The HPMM algorithm was compared with the Cosine Iterative Algorithm (CIA) and the Diffusing GO Terms in the Directed PPI Network (GoDIN) algorithm. The experimental results indicate that HPMM obtains higher precision and F-measure than CIA and GoDIN, which are purely PPI network based methods.
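The abstract reports precision and F-measure but does not say how they were computed. As a rough illustration only, the sketch below uses one common set-based, micro-averaged definition for multi-label function prediction. The function name `precision_recall_f`, the protein IDs and the labels `F1` to `F4` are hypothetical and do not come from the paper.

```python
def precision_recall_f(predicted, actual):
    """Micro-averaged precision, recall and F-measure over proteins.

    predicted, actual: dicts mapping a protein ID to a set of function labels.
    This is an assumed definition, not necessarily the one used for HPMM.
    """
    tp = fp = fn = 0
    for protein, true_labels in actual.items():
        pred_labels = predicted.get(protein, set())
        tp += len(pred_labels & true_labels)   # correctly predicted functions
        fp += len(pred_labels - true_labels)   # predicted but not annotated
        fn += len(true_labels - pred_labels)   # annotated but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical toy annotations for two proteins.
actual = {"P1": {"F1", "F2"}, "P2": {"F3"}}
predicted = {"P1": {"F1", "F4"}, "P2": {"F3"}}
precision, recall, f_measure = precision_recall_f(predicted, actual)
```

In this toy case two of the three predicted labels are correct and two of the three annotated labels are recovered, so precision, recall and F-measure all equal 2/3.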
Simulated annealing algorithm for solving the two-species small phylogeny problem
WU Jingli, LI Xiancheng
Journal of Computer Applications    2016, 36 (4): 1027-1032.   DOI: 10.11772/j.issn.1001-9081.2016.04.1027
To solve the two-species Small Phylogeny Problem (SPP) in the duplication-loss model, a simulated annealing algorithm named SA2SP was devised for the duplication-loss alignment problem. An alignment algorithm was introduced to construct the initial solution; a labeling algorithm was used to construct the objective function and obtain the evolution cost; and three intelligent neighborhood functions were introduced to generate neighborhood solutions by exploiting the evolutionary characteristics of gene sequences. The ribosomal RiboNucleic Acid (rRNA) and transfer RiboNucleic Acid (tRNA) of four real bacteria were used to test the performance of SA2SP and the Pseudo-Boolean Linear Programming (PBLP) algorithm. The experimental results show that SA2SP obtains a smaller evolution cost than PBLP, and that it is an effective method for solving the two-species SPP in the duplication-loss model.
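The general structure described above (an initial solution, an objective function and neighborhood moves) can be sketched as a generic simulated annealing loop. This is a minimal sketch, not SA2SP itself: the toy cost and neighbor functions stand in for the paper's labeling algorithm and its three neighborhood functions, and all names and parameter values (`t0`, `cooling`, `steps`, `seed`) are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor,
                        t0=10.0, cooling=0.995, steps=2000, seed=0):
    """Generic simulated annealing with a geometric cooling schedule.

    Returns the best solution seen and its cost.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


# Hypothetical toy problem: find the integer x that minimizes (x - 7)^2.
def toy_cost(x):
    return (x - 7) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))


best, best_cost = simulated_annealing(0, toy_cost, toy_neighbor)
```

In an SA2SP-style setting, `initial` would presumably be an alignment of the two genomes, `cost` the evolution cost, and `neighbor` a move chosen from the neighborhood functions; those components are not specified in the abstract and are not reproduced here.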